Abstract

We present the first explicit connection between quantum computation and lattice problems. 
Namely, we show a solution to the Unique Shortest Vector Problem (SVP) under the assumption 
that there exists an algorithm that solves the hidden subgroup problem on the dihedral group 
by coset sampling. Moreover, we solve the hidden subgroup problem on the dihedral group by 
using an average case subset sum routine. By combining the two results, we get a quantum 
reduction from $\Theta(n^{2.5})$-unique-SVP to the average case subset sum problem.

1 Introduction 

Quantum computation is a computation model based on quantum physics. Assuming that the 
laws of nature as we know them are true, this might allow us to build computers that are able 
to perform tasks that classical computers cannot perform in any reasonable time. One task which 
quantum algorithms are known to perform much better than classical algorithms is that of factoring
large integers. The importance of this problem stems from its ubiquitous use in cryptographic 
applications. While there are no known polynomial time classical algorithms for this problem, 
a groundbreaking result of Shor from 1994 [24] showed a polynomial time quantum algorithm
for factoring integers. In the same paper, Shor showed an algorithm for finding the discrete log. 
However, despite enormous effort, we have only a few other problems for which quantum algorithms 
provide an exponential speedup (e.g., ^2 EE])- Other notable quantum algorithms such as Deutsch 
and Jozsa's algorithm [3] and Simon's algorithm [23] operate in the black box model. Grover's 
algorithm [10] provides a square root speedup over classical algorithms.

The current search for new quantum algorithms concentrates on problems which are not known 
to be NP-hard. These include the graph isomorphism problem and lattice problems. In this paper
we are interested in lattice problems or specifically, the unique shortest vector problem (SVP).
A lattice is a set of all integral linear combinations of a set of $n$ linearly independent vectors
in $\mathbb{R}^n$. This set of $n$ vectors is known as a basis of the lattice. In the SVP we are interested in
finding the shortest nonzero vector in a lattice. In the $f(n)$-unique-SVP we are given the additional
promise that the shortest vector is shorter by a factor of at least $f(n)$ from all other nonparallel
vectors. This problem also has important applications in cryptography. Namely, Ajtai and Dwork's
cryptosystem [2] and the recent cryptosystem by Regev [22] are based on the hardness of this lattice
problem. 

A central problem in quantum computation is the hidden subgroup problem (HSP). Here, we 
are given a black box that computes a function on elements of a group G. The function is known to 
be constant and distinct on left cosets of a subgroup $H \le G$ and our goal is to find $H$. Interestingly,
almost all known quantum algorithms which run super-polynomially faster than classical algorithms 
solve special cases of the HSP on Abelian groups. Also, it is known that solving the HSP on the 
symmetric group leads to a solution to graph isomorphism [14]. This motivated research into
possible extensions of the HSP to noncommutative groups (see, e.g., [HI E3 1231 U\ ) ■ However, prior 
to this paper the HSP on groups other than the symmetric group and Abelian groups had no known 
applications. 

In this paper we will be interested in the HSP on the dihedral group. The dihedral group of 
order $2N$, denoted $D_N$, is the group of symmetries of an $N$-sided regular polygon. It is isomorphic
to the abstract group generated by the element $\rho$ of order $N$ and the element $\tau$ of order 2 subject
to the relation $\rho\tau = \tau\rho^{-1}$. Although the dihedral group has a much simpler structure than the
symmetric group, no efficient solution to the HSP on the dihedral group is known. Ettinger and 
Høyer [H] showed that one can obtain sufficient statistical information about the hidden subgroup
with only a polynomial number of queries. However, there is no efficient algorithm that solves the
HSP using this information. Currently, the best known algorithm is due to Kuperberg [17] and
runs in subexponential time $2^{O(\sqrt{\log N})}$.

The following is the main theorem of this paper. The dihedral coset problem is described in 
the following paragraph. 

Theorem 1.1 If there exists a solution to the dihedral coset problem with failure parameter $f$ then
there exists a quantum algorithm that solves the $\Theta(n^{\frac{1}{2}+2f})$-unique-SVP.

The input to the dihedral coset problem (DCP) is a tensor product of a polynomial number of 
registers. Each register is in the state $|0, x\rangle + |1, (x + d) \bmod N\rangle$ for some arbitrary
$x \in \{0, \ldots, N-1\}$ and $d$ is the same for all registers. These can also be thought of as cosets of the subgroup
$\{(0, 0), (1, d)\}$ in $D_N$. Our goal is to find the value $d$. In addition, we say that the DCP has a
failure parameter $f$ if each of the registers with probability at most $\frac{1}{(\log N)^f}$ is in the state $|b, x\rangle$ for
arbitrary b, x instead of a coset state. We note that any algorithm that solves the dihedral HSP 
by sampling cosets also solves the DCP for some failure parameter f. The reason is that since 
the algorithm samples only a polynomial number of cosets, we can take f to be large enough such 
that with high probability all the registers are coset states. This is summarized in the following 
corollary. 

Corollary 1.2 If there exists a solution to the dihedral HSP that samples cosets (e.g., any solution 
using the 'standard method') then there exists a quantum algorithm that solves poly(n)-unique-SVP.

The following is the second main theorem of this paper. In the subset sum problem we are given 
two integers t, N and a set of numbers. We are asked to find a subset of the numbers that sums to 
t modulo N. A legal input is an input for which such a subset exists (a formal definition appears 
in Section 4) and we are interested in algorithms that solve a non-negligible part of the inputs:

Theorem 1.3 If there exists an algorithm $S$ that solves $\frac{1}{\mathrm{poly}(\log N)}$ of the legal subset sum inputs
with parameter $N$ then there exists a solution to the DCP with failure parameter $f = 1$.

As shown in the dihedral HSP can be reduced to the case where the subgroup is of the form 
$\{(0,0), (1, d)\}$. Then, by sampling cosets we obtain states of the form $|0, x\rangle + |1, (x + d) \bmod N\rangle$
with no error. Hence, 





of the legal subset sum inputs 
Finally, the following is an immediate corollary of the two previous theorems:

Corollary 1.5 If there exists an algorithm that solves $\frac{1}{\mathrm{poly}(\log N)}$ of the legal subset sum inputs with
parameter $N$ then there exists a quantum algorithm for the $\Theta(n^{2.5})$-unique-SVP.

This result is known as a worst case to average case quantum reduction. Such reductions are already 
known in the classical case [TT 131 HI IT§1 I22| . The exponent 2.5 in our reduction is better than the 
one in pJIBHUEij- However, the reduction in [22], which appeared after the original publication 
of the current paper, further improves the exponent to 1.5 and hence subsumes our reduction. In 
addition, unlike the classical reductions, our subset sum problems have a density of one, i.e., the 
size of the input set is very close to $\log N$. Therefore, some cryptographic applications such as the
one by Impagliazzo and Naor JH] cannot be used. 

Intuitive overview 

Before proceeding to the main part of the paper, we describe our methods in a somewhat intuitive 
way. First, let us describe the methods used in solving the unique-SVP. Recall that our solution 
is based on a solution to the DCP. We begin by showing how such a solution can be used to solve 
a slightly different problem which we call the two point problem. Instead of a superposition of 
two numbers with a fixed difference, our input consists of registers in a superposition of two n- 
dimensional vectors with a fixed difference. Then, the idea is to create an input to the two point 
problem in the following way. Start by creating a superposition of many lattice points and collapse 
the state to just two lattice points whose difference is the shortest vector. Repeating this procedure 
creates an input to the two point problem whose solution is the shortest vector. 

Collapsing the state is performed by partitioning the space into cubes. Assume the partition 
has the property that in each cube there are exactly two lattice points whose difference is the 
shortest vector. Then, we compute the cube in which each point is located and measure the result. 
The state collapses to a superposition of just the two points inside the cube we measured. The 
important thing is to make sure that exactly two points are located in each cube. First, in order to 
make sure that the cubes are not aligned with the lattice, we randomly translate them. The length 
of the cubes is proportional to the length of the shortest vector. Although the exact length of the 
shortest vector is unknown, we can try several estimates until we find the right value. Since the 
lattice has a unique shortest vector, all other nonparallel vectors are considerably longer and do not
fit inside a cube. Therefore we know that the difference between any two points inside the same 
cube is a multiple of the shortest vector. Still, this is not good enough since instead of two points 
inside each box we are likely to have more points aligned along the shortest vector. Hence, we space 
out the lattice: instead of creating a superposition of all the lattice points we create a superposition 
of a subset of the points. The set of points created by this technique has the property that along 
the direction of the shortest vector there are pairs of points whose difference is the shortest vector 
and the distance between two such pairs is much larger than the shortest vector. As before, this 
can be done without knowing the shortest vector by trying several possibilities. 

The second part of the paper describes a solution to the DCP with failure parameter 1 which 
uses a solution to the average case subset sum problem. Recall that we are given registers of 
the form $|0, x\rangle + |1, (x + d) \bmod N\rangle$ where $x \in \{0, \ldots, N-1\}$ is arbitrary and we wish to find
$d \in \{0, \ldots, N-1\}$. Consider one such register. We begin by applying the Fourier transform to the
second part of the register (the one holding $x$ and $x + d$) and then measuring it. If $a$ is the value we
measured, the state collapses to a combination of the basis states $|0\rangle$ and $|1\rangle$ such that their phase
difference is $2\pi\frac{ad}{N}$. If we were lucky enough to measure $a = 1$, then the phase difference is $2\pi\frac{d}{N}$
and by measuring this phase difference we can obtain an estimation on $d$. This, however, happens
with exponentially small probability. Since the phase is modulo $2\pi$, extracting the value $d$ is much
harder when $a$ is larger. Instead, we perform the same process on $r$ registers and let $a_1, \ldots, a_r$ be
the values we measure. The resulting tensor state includes a combination of all $2^r$ different 0,1
sequences. The phase of each sequence can be described as follows. By ignoring a fixed phase, we
can assume that the phase of the sequence $00\ldots0$ is 0. Then, the phase of the sequence $100\ldots0$
is $2\pi\frac{a_1 d}{N}$ and in general, the phase of the sequence $\alpha_1\alpha_2\ldots\alpha_r$ is $2\pi\frac{d}{N}$ multiplied by the sum of the
values $a_i$ for which $\alpha_i = 1$. This indicates that we should try to measure the phase difference of
two sequences whose sums differ by 1. However, although we can estimate the phase difference of
one qubit, estimating the phase difference of two arbitrary sequences is not possible.
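
The single-register step can be checked numerically. The following is a minimal classical sketch, not part of the paper's algorithm: it assumes the Fourier transform over $\mathbb{Z}_N$ maps $|y\rangle$ to $N^{-1/2}\sum_a e(ay/N)|a\rangle$, and the function names are hypothetical. It computes the two amplitudes left on the first qubit after the outcome $a$ is observed and confirms that their relative phase is $ad/N$ of a full turn, independent of $x$.

```python
import cmath
import math


def e(x: float) -> complex:
    """The paper's notation e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def post_measurement_qubit(N: int, x: int, d: int, a: int):
    """Amplitudes (c0, c1) of the first qubit after the Fourier transform on
    the second part of |0,x> + |1,(x+d) mod N> gives the outcome a."""
    # Conditioning on outcome a leaves amplitude e(a*y/N) on the branch that
    # held the value y (assumed convention for the Fourier transform).
    c0 = e(a * x / N)
    c1 = e(a * ((x + d) % N) / N)
    norm = math.sqrt(abs(c0) ** 2 + abs(c1) ** 2)
    return c0 / norm, c1 / norm


def relative_phase(N: int, x: int, d: int, a: int) -> float:
    """Phase of |1> relative to |0>, as a fraction of a full turn in [0, 1)."""
    c0, c1 = post_measurement_qubit(N, x, d, a)
    return (cmath.phase(c1 / c0) / (2 * math.pi)) % 1.0
```

For example, with $N = 64$, $d = 5$ and $a = 7$ the relative phase is $35/64$ of a turn for every choice of $x$, which is why only $a$ and $d$ matter in the discussion above.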

We proceed by choosing $r$ to be very close to $\log N$. This creates a situation in which for almost
every $t \in \{0, \ldots, N-1\}$ there is a subset whose sum modulo $N$ is $t$ and in addition, there are
not too many subsets that sum to the same $t$ modulo $N$. Assume for simplicity that every $t$ has
exactly one subset that sums to $t$ modulo $N$. We calculate for each sequence the value $\lfloor t/2 \rfloor$ where
$t$ is its sum. After measuring the result, say $s$, we know that the state is a superposition of two
sequences: one that sums to $2s$ and one that sums to $2s + 1$. Notice that since $a_1, \ldots, a_r$ are
uniformly chosen between $\{0, \ldots, N-1\}$ we can use them as an input to the subset sum algorithm.
The key observation here is that the subset sum algorithm provides the reverse mapping, i.e., from
a value $t$ to a subset that sums to $t$. So, from $s$ we can find the sequence $\bar{\alpha}_1$ that sums to $2s$ and
the sequence $\bar{\alpha}_2$ that sums to $2s + 1$. Since we know that the state is a superposition of $|\bar{\alpha}_1\rangle$ and
$|\bar{\alpha}_2\rangle$ we can use a unitary transformation that transforms $|\bar{\alpha}_1\rangle$ to $|0\rangle$ and $|\bar{\alpha}_2\rangle$ to $|1\rangle$. Now, since the
two states differ in one qubit, we can easily measure the phase difference and obtain an estimate
on $d$. This almost completes the description of the DCP algorithm. The estimate on $d$ is only
polynomially accurate but in order to find $d$ we need exponential accuracy. Hence, we repeat the
same process with pairs whose difference is higher. So, instead of choosing pairs of difference 1 we
choose pairs of difference 2 to get an estimate on $2d$, then 4 to get an estimate on $4d$ and so on$^1$.
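
The doubling idea can be illustrated by a purely classical post-processing sketch. It rests on assumptions that are stronger than what the paper has available: we suppose an estimate of $(2^j d / N) \bmod 1$ with additive error at most $1/8$ is given for every $j = 0, \ldots, k$ with $2^k \ge N$, whereas the paper only measures enough multiples of the phase. The function name `recover_d` is hypothetical.

```python
def recover_d(N, estimates):
    """Recover d in {0, ..., N-1} from coarse phase estimates.

    Assumption of this sketch: estimates[j] approximates (2**j * d / N) mod 1
    to within 1/8 for j = 0, ..., k, where 2**k >= N.
    """
    est = estimates[0] % 1.0  # coarse estimate of phi = d / N
    for j in range(1, len(estimates)):
        step = 1.0 / 2 ** j
        # phi must be close to (estimates[j] + m) / 2**j for some integer m;
        # pick the candidate nearest to the current, coarser estimate.
        base = (estimates[j] % 1.0) * step
        m = round((est - base) / step)
        est = (base + m * step) % 1.0
    # After the last step the error is at most (1/8) / 2**k < 1 / (2N).
    return round(est * N) % N
```

Each round halves the uncertainty: the estimate of $2^j d/N$ pins down $d/N$ up to multiples of $2^{-j}$, and the previous round is accurate enough to select the right multiple.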

Outline 

The next section contains some notations that are used in this paper. The two main sections of 
this paper are independent. In Section 3 we prove Theorem 1.1 and Section 4 contains the proof
of Theorem 1.3.

2 Preliminaries 

We denote the imaginary unit by $\imath$ and use the notation $e(x) = e^{2\pi\imath x}$. Occasionally, we omit the
normalization of quantum states. We use the term $n$-ball to refer to the $n$-dimensional solid body
and the term sphere to refer to its surface. We denote the set $\{1, \ldots, n\}$ by $[n]$. All logarithms
are of base 2 unless otherwise specified. We use $\delta_{ij}$ to denote the Kronecker delta, i.e., 1 if $i = j$
and 0 otherwise. A sequence $\alpha \in \{0, 1\}^r$ is identified with the set $\{i \mid \alpha_i = 1\}$. Several constants
appear in our proofs. To make it easier to follow, we denote constants with a subscript that is
somewhat related to their meaning. Specifically, in Section 3, $c_{\mathrm{cub}}$ is related to the cubes that
partition the space, $c_{\mathrm{bal}}$ is related to the radius of the balls, and $c_{\mathrm{unq}}$ appears in the guarantee of
the unique shortest vector. Also, in Section 4 we use $c_r$ in the definition of the parameter $r$, $c_s$ in
our assumptions on the subset sum subroutine and $c_m$ when we prove the existence of matchings.
The following is the formal definition of the DCP: 

$^1$This description is very similar to the method of exponentially accurate phase estimation used in Kitaev's
algorithm [16]. Actually, our case is slightly more difficult because we cannot measure all the multiples $2^i$. Nevertheless,
we can measure enough multiples of the phase to guarantee exponential accuracy.
Definition 2.1 The input to the DCP with failure parameter $f$ consists of poly(log N) registers.
Each register is with probability at least $1 - \frac{1}{(\log N)^f}$ in the state $\frac{1}{\sqrt{2}}(|0, x\rangle + |1, (x + d) \bmod N\rangle)$ on
$1 + \lceil \log N \rceil$ qubits where $x \in \{0, \ldots, N-1\}$ is arbitrary and $d$ is fixed. Otherwise, with probability at
most $\frac{1}{(\log N)^f}$, its state is $|b, x\rangle$ where $b \in \{0, 1\}$ and $x \in \{0, \ldots, N-1\}$ are arbitrary. We call such
a register a 'bad' register. We say that an algorithm solves the DCP if it outputs $d$ with probability
$1/\mathrm{poly}(\log N)$ and time poly(log N).



3 A Quantum Algorithm for unique-SVP 

In this section we prove Theorem 1.1. We begin by showing a simple reduction from the two
point problem to the DCP in Section 3.1. We then prove a weaker version of Theorem 1.1 with
$\Theta(n^{1+2f})$ instead of $\Theta(n^{\frac{1}{2}+2f})$ in Section 3.2. We complete the proof of Theorem 1.1 in Section
3.3. Throughout this section, we use a failure parameter $f > 0$ in order to make our results more
general. The reader might find it easier to take $f = 1$.



3.1 The Two Point Problem 

Definition 3.1 The input to the two point problem with failure parameter $f$ consists of poly(n log M)
registers. Each register is with probability at least $1 - \frac{1}{(n \log(2M))^f}$ in the state $\frac{1}{\sqrt{2}}(|0, a\rangle + |1, a'\rangle)$ on
$1 + n\lceil \log M \rceil$ qubits where $a, a' \in \{0, \ldots, M-1\}^n$ are arbitrary such that $a' - a$ is fixed. Otherwise,
with probability at most $\frac{1}{(n \log(2M))^f}$, its state is $|b, a\rangle$ where $b \in \{0, 1\}$ and $a \in \{0, \ldots, M-1\}^n$ are
arbitrary. We say that an algorithm solves the two point problem if it outputs $a' - a$ with probability
$1/\mathrm{poly}(n \log M)$ and time poly(n log M).

Lemma 3.2 If there exists an algorithm that solves the DCP with failure parameter f then there 
is an algorithm that solves the two point problem with failure parameter f . 

Proof: Consider the following mapping from $\{0, \ldots, M-1\}^n$ to $\{0, \ldots, (2M)^n - 1\}$:

$$f(a_1, \ldots, a_n) = a_1 + a_2 \cdot 2M + \ldots + a_n (2M)^{n-1}.$$

Given an input to the two point problem, we create an input to the DCP by using the above
mapping on the last $n\lceil \log M \rceil$ qubits of each register. Hence, each register is with probability at
least $1 - \frac{1}{(n \log(2M))^f}$ in the state

$$\frac{1}{\sqrt{2}}(|0, f(a)\rangle + |1, f(a')\rangle).$$

The difference $f(a') - f(a)$ is $(a'_1 - a_1) + (a'_2 - a_2) \cdot 2M + \ldots + (a'_n - a_n)(2M)^{n-1}$ and is therefore fixed.
Otherwise, with probability at most $\frac{1}{(n \log(2M))^f}$ the register is in the state $|b, f(a)\rangle$ for arbitrary
$b, a$. This is a valid input to the DCP with $N = (2M)^n$ since the probability of a bad register is at
most

$$\frac{1}{(n \log(2M))^f} = \frac{1}{(\log N)^f}.$$

Using the DCP algorithm with the above input we obtain the difference $b_1 + b_2 \cdot 2M + \ldots +
b_n (2M)^{n-1}$ where $b_i = a'_i - a_i$. In order to extract the $b_i$'s we add $M + M \cdot 2M + M(2M)^2 + \ldots +
M(2M)^{n-1}$. Extracting $b_i$ from $(b_1 + M) + (b_2 + M) \cdot 2M + \ldots + (b_n + M)(2M)^{n-1}$ is possible since
each $b_i + M$ is an integer in the range 1 to $2M - 1$. The solution to the two point problem is the
vector $(b_1, \ldots, b_n)$. ■
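
The classical bookkeeping in this proof, packing a vector into one integer in base $2M$ and recovering the signed difference, can be sketched as follows. This is an illustration only: the function names are hypothetical and the difference is assumed to be given modulo $N = (2M)^n$, as a DCP solver would return it.

```python
def pack(a, M):
    """f(a_1, ..., a_n) = a_1 + a_2*(2M) + ... + a_n*(2M)^(n-1)."""
    return sum(ai * (2 * M) ** i for i, ai in enumerate(a))


def unpack_difference(diff, n, M):
    """Recover (b_1, ..., b_n), each with |b_i| < M, from
    diff = b_1 + b_2*(2M) + ... + b_n*(2M)^(n-1) taken modulo (2M)^n."""
    N = (2 * M) ** n
    # Adding M in every position turns each digit into b_i + M, which lies
    # in the range 1 to 2M - 1, so plain base-2M digit extraction works.
    shifted = (diff + pack([M] * n, M)) % N
    b = []
    for _ in range(n):
        shifted, digit = divmod(shifted, 2 * M)
        b.append(digit - M)
    return b
```

For instance, with $M = 8$, $a = (3, 0, 7)$ and $a' = (1, 5, 2)$, unpacking `(pack(a', M) - pack(a, M)) % (2 * M) ** 3` gives back $(-2, 5, -5)$. The base $2M$ rather than $M$ is what leaves room for negative coordinates.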
3.2 A Weaker Algorithm

We recall several facts about an LLL-reduced basis. Such a basis can be found for any lattice by
using a polynomial time algorithm jXHJ . Given a basis $(b_1, \ldots, b_n)$, let $(b_1^*, \ldots, b_n^*)$ be its Gram-Schmidt
orthogonalization. That is, $b_i^*$ is the component of $b_i$ orthogonal to the subspace spanned
by $b_1, \ldots, b_{i-1}$. An LLL reduced basis $(b_1, \ldots, b_n)$ satisfies that $\|b_i^*\| \le \sqrt{2}\|b_{i+1}^*\|$ and that for
$i > j$, $|\langle b_i, b_j^* \rangle| \le \frac{1}{2}\|b_j^*\|^2$. In addition, recall that $\min_i \|b_i^*\|$ is a lower bound on the length of
the shortest vector. Since $b_1^* = b_1$ and $\|b_1^*\| \le 2^{(i-1)/2}\|b_i^*\|$ we get that the vector $b_1$ is at most
$2^{(n-1)/2}$ times longer than the shortest vector. Consider the representation of the LLL basis in the
orthonormal basis $\left(\frac{b_1^*}{\|b_1^*\|}, \ldots, \frac{b_n^*}{\|b_n^*\|}\right)$. The vector $b_i$ can be written as $(b_{i,1}, b_{i,2}, \ldots, b_{i,i}, 0, \ldots, 0)$. Notice
that $b_{i,i} = \|b_i^*\|$ and that $|b_{i,j}| \le \frac{1}{2}\|b_j^*\|$ for every $i > j$. In the following, $u$ denotes the shortest
vector.

Lemma 3.3 Consider the representation of the shortest vector $u$ in the LLL-reduced lattice basis
$u = \sum_{i=1}^n u_i b_i$. Then, $|u_i| \le 2^{2n}$ for $i \in [n]$.

Proof: Changing to the orthonormal basis, $u = \sum_{i=1}^n u_i b_i = \sum_{i=1}^n \left(\sum_{j=i}^n u_j b_{j,i}\right) \frac{b_i^*}{\|b_i^*\|}$. In addition,
we know that $\|b_i^*\| \ge 2^{-(i-1)/2}\|b_1^*\| \ge 2^{-n}\|u\|$. Hence, $\left|\sum_{j=i}^n u_j b_{j,i}\right| \le 2^n \|b_i^*\|$ for every $i \in [n]$. By
taking $i = n$ we get that $|u_n|$ is at most $2^n$. We continue inductively and show that $|u_k| \le 2^{2n-k}$.
Assume that the claim holds for $u_{k+1}, \ldots, u_n$. Then,

$$\left|\sum_{j=k+1}^n u_j b_{j,k}\right| \le \frac{1}{2} \sum_{j=k+1}^n |u_j| \, \|b_k^*\| \le \frac{1}{2} \sum_{j=k+1}^n 2^{2n-j} \|b_k^*\| \le \frac{1}{2} \cdot 2^{2n-k} \|b_k^*\|.$$

By the triangle inequality,

$$|u_k b_{k,k}| \le \left|\sum_{j=k+1}^n u_j b_{j,k}\right| + \left|\sum_{j=k}^n u_j b_{j,k}\right| \le \left(\frac{1}{2} 2^{2n-k} + 2^n\right)\|b_k^*\| \le 2^{2n-k}\|b_k^*\|$$

and the proof is completed. ■

Let $p > n^{2+2f}$ be any fixed prime. The following is the main lemma of this section:

Lemma 3.4 For any $f > 0$ let $u = \sum_{i=1}^n u_i b_i$ be the shortest lattice vector in a $(c_{\mathrm{unq}} n^{1+2f})$-unique
lattice where $c_{\mathrm{unq}} > 0$ is a constant. If there exists a solution to the two point problem with failure
parameter $f$ then there exists a quantum algorithm that given this lattice and three integers $l, m, i_0$
returns $(u_1, \ldots, u_{i_0-1}, \frac{u_{i_0} - m}{p}, u_{i_0+1}, \ldots, u_n)$ with probability $1/\mathrm{poly}(n)$ if the following conditions
hold: $\|u\| \le l \le 2\|u\|$, $u_{i_0} \equiv m \pmod p$ and $1 \le m \le p - 1$.

We first show how this lemma implies Theorem 1.1 with $\Theta(n^{1+2f})$ by describing the SVP
algorithm. According to Lemma 3.2 and the assumption of the theorem, there exists a solution to
the two point problem with failure parameter $f$. Hence, Lemma 3.4 implies that there exists an
algorithm that given the right values of $l, m, i_0$ outputs $(u_1, \ldots, u_{i_0-1}, \frac{u_{i_0} - m}{p}, u_{i_0+1}, \ldots, u_n)$. The
value $l$ is an estimate of the length of the shortest vector $u$. Because the LLL algorithm gives a
$2^{(n-1)/2}$-approximation to the length of the shortest vector, one of $(n-1)/2$ different values of $l$ is
as required. In addition, since $u$ is the shortest vector, $u/p$ cannot be a lattice vector and therefore
there exists an $i_0$ such that $u_{i_0} \not\equiv 0 \pmod p$. Hence, there are only $O(pn^2)$ possible values for
$l, m$ and $i_0$. With each of these values the SVP algorithm calls the algorithm of Lemma 3.4 a
polynomial number of times. With high probability in one of these calls the algorithm returns the
vector $(u_1, \ldots, u_{i_0-1}, \frac{u_{i_0} - m}{p}, u_{i_0+1}, \ldots, u_n)$ from which $u$ can be extracted. The results of the other
calls can be easily discarded because they are either longer lattice vectors or non-lattice vectors.

Proof: (of Lemma 3.4) We start by applying the LLL algorithm to the unique lattice in order to
create a reduced basis. Denote the resulting basis by $(b_1, \ldots, b_n)$. Let $(e_1, \ldots, e_n)$ be the standard
orthonormal basis of $\mathbb{R}^n$.
Let $w_1, \ldots, w_n$ be $n$ real values in $[0, 1)$ and let $M = 2^{4n}$. Assume without loss of generality
that $i_0 = 1$. The function $f$ is defined as $f(t, a) = (a_1 p + tm) b_1 + \sum_{i=2}^n a_i b_i$ where $t \in \{0, 1\}$ and
$a = (a_1, \ldots, a_n) \in A = \{0, \ldots, M-1\}^n$. It maps the elements of $\{0, 1\} \times A$ to lattice points.
In addition, consider a lattice vector $v$ represented in the orthonormal basis $v = \sum_{i=1}^n v_i e_i$. The
function $g$ maps $v$ to the vector $\left(\lfloor v_1 / (c_{\mathrm{cub}} n^{\frac{1}{2}+2f} \cdot l) - w_1 \rfloor, \ldots, \lfloor v_n / (c_{\mathrm{cub}} n^{\frac{1}{2}+2f} \cdot l) - w_n \rfloor\right)$ in $\mathbb{Z}^n$ where
the constant $c_{\mathrm{cub}} > 0$ will be specified later.

In the following, we describe a routine that creates one register in the input to the two
point problem that hides the difference $(u_1, \ldots, u_{i_0-1}, \frac{u_{i_0} - m}{p}, u_{i_0+1}, \ldots, u_n)$. We call the routine
poly(n log M) = poly(n) times in order to create a complete input to the two point problem. We
then call the two point algorithm and output its result. This completes the proof of the lemma
since with probability $1/\mathrm{poly}(n \log M) = 1/\mathrm{poly}(n)$ our output is correct.

The routine starts by choosing $w_1, \ldots, w_n$ uniformly from $[0, 1)$. We create the state

$$\frac{1}{\sqrt{2M^n}} \sum_{t \in \{0,1\},\, a \in A} |t, a\rangle.$$

Then, we compute the function $F = g \circ f$ and measure the result, say $r_1, \ldots, r_n$. The state collapses
to (normalization omitted)

$$\sum_{\substack{t \in \{0,1\},\, a \in A \\ F(t,a) = (r_1, \ldots, r_n)}} |t, a\rangle |r_1, \ldots, r_n\rangle.$$

This completes the description of the routine. Its correctness is shown in the next two claims. 

Claim 3.5 For every $\bar{r} \in \mathbb{Z}^n$, there is at most one element of the form $(0, a)$ and at most one
element of the form $(1, a')$ that get mapped to $\bar{r}$ by $F$. Moreover, if both $(0, a)$ and $(1, a')$ get
mapped to $\bar{r}$ then $a' - a$ is the vector $(\frac{u_1 - m}{p}, u_2, \ldots, u_n)$.

Proof: Consider two different lattice points in the image of $f$, $v = f(t, a)$ and $v' = f(t', a')$, that get
mapped to $\bar{r}$ by $g$. Let $v = \sum_{i=1}^n v_i e_i$ and $v' = \sum_{i=1}^n v'_i e_i$ be their representation in the orthonormal
basis. If $v' - v$ is not a multiple of the shortest vector, then $\|v' - v\| > c_{\mathrm{unq}} n^{1+2f} \|u\| \ge \frac{1}{2} c_{\mathrm{unq}} n^{1+2f} \cdot l$.
Therefore, there exists a coordinate $i \in [n]$ such that $|v'_i - v_i| \ge \frac{1}{2} c_{\mathrm{unq}} n^{\frac{1}{2}+2f} \cdot l$ and for $c_{\mathrm{unq}} > 2c_{\mathrm{cub}}$
this implies $g(v) \ne g(v')$ no matter how $w_1, \ldots, w_n$ are chosen. Hence, $v' - v = k \cdot u$ for some
integer $k \ne 0$. By considering the first coordinate of $v' - v$ in the lattice basis we get that
$(a'_1 p + t'm) - (a_1 p + tm) \equiv k \cdot m \pmod p$. This implies that $k \equiv t' - t \pmod p$. If $t = t'$
then $k \equiv 0 \pmod p$ which implies that $|k| \ge p$. Thus, $\|v' - v\| \ge p\|u\| > c_{\mathrm{cub}} n^{1+2f} \cdot l$ and again,
$g(v) \ne g(v')$. This proves the first part of the claim. For the second part, let $t = 0$ and $t' = 1$.
Then, $k \equiv 1 \pmod p$. As before, this can only happen when $k = 1$ and hence the second part of
the claim holds. ■

Hence, it is enough to show that the probability that this register is bad is low enough. The
probability of measuring $|r_1, \ldots, r_n\rangle$ equals $\frac{1}{2M^n} \cdot |\{(t, a) \mid F(t, a) = (r_1, \ldots, r_n)\}|$. Notice that this
probability is the same as the probability that $F(t, a) = (r_1, \ldots, r_n)$ for randomly chosen $t$ and $a$.
Hence, we consider a randomly chosen $t$ and $a$. If $t = 0$, let $a' = (a_1 + \frac{u_1 - m}{p}, a_2 + u_2, \ldots, a_n + u_n)$
and if $t = 1$ let $a' = (a_1 - \frac{u_1 - m}{p}, a_2 - u_2, \ldots, a_n - u_n)$.

Claim 3.6 With probability at least $1 - \frac{1}{(n \log(2M))^f}$, for randomly chosen $t$ and $a$, $a'$ is in $A$ and
$F(1 - t, a') = F(t, a)$.
Proof: We assume that $t = 0$, the proof for $t = 1$ is similar. According to Lemma 3.3, $|u_i| \le 2^{2n}$.
Hence, unless there exists an $i$ for which $a_i < 2^{2n}$ or $a_i \ge M - 2^{2n}$, $a'$ is guaranteed to be in $A$.
This happens with probability at most $n 2^{2n+1}/M$ because $a$ is a random element of $A$.

Notice that $f(1, a') - f(0, a) = u$. Since $w_1, \ldots, w_n$ are randomly chosen, the probability that
$F(1 - t, a')$ and $F(t, a)$ differ on the $i$'th coordinate is at most

$$\frac{|\langle u, e_i \rangle|}{c_{\mathrm{cub}} n^{\frac{1}{2}+2f} \cdot l} \le \frac{|\langle u, e_i \rangle|}{c_{\mathrm{cub}} n^{\frac{1}{2}+2f} \cdot \|u\|}.$$

By the union bound, the probability that $F(1 - t, a') \ne F(t, a)$ is at most

$$\sum_{i=1}^n \frac{|\langle u, e_i \rangle|}{c_{\mathrm{cub}} n^{\frac{1}{2}+2f} \cdot \|u\|} \le \frac{1}{c_{\mathrm{cub}} n^{2f}}$$

where we used the fact that the $\ell_1$ norm of a vector is at most $\sqrt{n}$ times its $\ell_2$ norm.

The sum of the two error probabilities $\frac{n 2^{2n+1}}{M} + \frac{1}{c_{\mathrm{cub}} n^{2f}}$ is at most $\frac{1}{(n \log(2M))^f}$ for $c_{\mathrm{cub}}$ large enough. ■

This concludes the proof of Lemma 3.4. ■
3.3 An Improved Algorithm 

In this section we complete the proof of Theorem 1.1. The algorithm we describe has many
similarities with the one in the previous section. The main difference is that it is based on
$n$-dimensional balls instead of cubes. The idea is to construct a ball of the right radius around lattice
points and to show that if two lattice points are close then the two balls have a large intersection
while for any two far lattice points the balls do not intersect. For technical reasons, we will assume
in this section that the lattice is a subset of $\mathbb{Z}^n$. Any lattice with rational points can be scaled so
that it is a subset of $\mathbb{Z}^n$. We begin with some technical claims:

Claim 3.7 For any $R > 0$, let $B_n$ be the ball of radius $R$ centered around the origin in $\mathbb{R}^n$ and let
$B'_n = B_n + d$ for some vector $d$ be a shifted ball. Then, the relative $n$-dimensional volume of their
intersection is at least $1 - O(\sqrt{n}\|d\|/R)$, i.e.,

$$\frac{\mathrm{vol}(B_n \cap B'_n)}{\mathrm{vol}(B_n)} \ge 1 - O(\sqrt{n}\|d\|/R).$$

Proof: Consider a point $x \in \mathbb{R}^n$ such that $\langle x, d \rangle / \|d\| \ge \|d\|/2$, i.e., a point which is closer to the
center of $B'_n$ than to the center of $B_n$. Notice that $x \in B_n$ implies $x \in B'_n$. In other words, the cap
$C_n$ of $B_n$ given by all such points $x$ is contained in $B_n \cap B'_n$. By using a symmetric argument for
points $x \in \mathbb{R}^n$ such that $\langle x, d \rangle / \|d\| \le \|d\|/2$ we get,

$$\mathrm{vol}(B_n \cap B'_n) = 2 \cdot \mathrm{vol}(C_n).$$

We can lower bound the volume of $C_n$ by half the volume of $B_n$ minus the volume of an
$n$-dimensional cylinder of radius $R$ and height $\|d\|/2$:

$$\mathrm{vol}(C_n) \ge \frac{1}{2}\mathrm{vol}(B_n) - \frac{\|d\|}{2}\mathrm{vol}(B_{n-1})$$

where $B_{n-1}$ is the $(n-1)$-ball of radius $R$. We complete the proof by using the estimate
$\mathrm{vol}(B_{n-1})/\mathrm{vol}(B_n) = O(\sqrt{n}/R)$,

$$\mathrm{vol}(C_n)/\mathrm{vol}(B_n) \ge \frac{1}{2} - O(\sqrt{n}\|d\|/R).$$
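
The estimate $\mathrm{vol}(B_{n-1})/\mathrm{vol}(B_n) = O(\sqrt{n}/R)$ used in the last step can be checked numerically. The sketch below is only an illustration under one assumption not stated in the text, the standard closed form $\mathrm{vol}(B_n) = \pi^{n/2} R^n / \Gamma(n/2 + 1)$; the function names are hypothetical. Logarithms are used so that large $n$ does not overflow.

```python
import math


def log_ball_volume(n: int, R: float) -> float:
    """Natural log of the volume of the n-dimensional ball of radius R,
    using vol = pi^(n/2) * R^n / Gamma(n/2 + 1)."""
    return (n / 2) * math.log(math.pi) + n * math.log(R) - math.lgamma(n / 2 + 1)


def cross_section_ratio(n: int, R: float) -> float:
    """vol(B_{n-1}) / vol(B_n) for balls of the same radius R."""
    return math.exp(log_ball_volume(n - 1, R) - log_ball_volume(n, R))


def intersection_lower_bound(n: int, R: float, d_norm: float) -> float:
    """Lower bound from the proof on vol(B_n ∩ B'_n) / vol(B_n):
    2 * vol(C_n) / vol(B_n) >= 1 - ||d|| * vol(B_{n-1}) / vol(B_n)."""
    return 1.0 - d_norm * cross_section_ratio(n, R)
```

Evaluating `cross_section_ratio(n, R) * R / math.sqrt(n)` for a range of $n$ shows that the quantity stays bounded by a constant, which is the content of the $O(\sqrt{n}/R)$ estimate.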






In the algorithm we will actually represent the balls using points of a fine grid. Therefore, we 
would like to say that the above claim still holds if we consider the number of grid points inside 
$B_n$, $B'_n$ and $B_n \cap B'_n$ instead of their volumes. The following claim is more than enough for our
needs: 

Claim 3.8 (Special case of Proposition 8.7 in [20]) Let $L$ be an integer and consider the scaled
integer grid $\frac{1}{L}\mathbb{Z}^n$. Then, for any convex body $Q$ that contains a ball of radius $r \ge \frac{1}{L} n^{1.5}$,

$$\left| \frac{|\frac{1}{L}\mathbb{Z}^n \cap Q|}{L^n \mathrm{vol}(Q)} - 1 \right| < \frac{2n^{1.5}}{rL}.$$



Corollary 3.9 Let $L = 2^n$ and consider the scaled integer grid $\frac{1}{L}\mathbb{Z}^n$. For any $R \ge 1$, let $B_n$ be
the ball of radius $R$ centered around the origin in $\mathbb{R}^n$ and let $B'_n = B_n + d$ for some vector $d$ such
that $R/\mathrm{poly}(n) \le \|d\| \le R$. Then, the relative number of grid points in their intersection is at least
$1 - O(\sqrt{n}\|d\|/R)$, i.e.,

$$\frac{|\frac{1}{L}\mathbb{Z}^n \cap B_n \cap B'_n|}{|\frac{1}{L}\mathbb{Z}^n \cap B_n|} \ge 1 - O(\sqrt{n}\|d\|/R).$$

Proof: We first note that $B_n$, $B'_n$ and $B_n \cap B'_n$ all contain the ball of radius $R/2 \ge 1/2$ centered
around $d/2$. Using Claim 3.8 with $r = R/2$ and $L = 2^n$ we obtain that the number of grid points in these bodies approximates
their volume up to a multiplicative error of at most $\frac{4n^{1.5}}{2^n} = 2^{-\Omega(n)}$. We complete the proof by using Claim
3.7. ■

Let $D(\cdot, \cdot)$ denote the trace distance between two quantum states. It is known that the
trace distance represents the maximum probability of distinguishing between the two states using
quantum measurements. We need the following simple bound on the trace distance: 

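Claim 3.10 For all $k > 0$ and density matrices $\sigma_1, \ldots, \sigma_k, \sigma'_1, \ldots, \sigma'_k$,

$$D(\sigma_1 \otimes \ldots \otimes \sigma_k, \sigma'_1 \otimes \ldots \otimes \sigma'_k) \le \sum_{i=1}^k D(\sigma_i, \sigma'_i).$$

Proof: Using the triangle inequality,

$$\begin{aligned}
D(\sigma_1 \otimes \ldots \otimes \sigma_k, \sigma'_1 \otimes \ldots \otimes \sigma'_k) \le{} & D(\sigma_1 \otimes \ldots \otimes \sigma_k, \sigma'_1 \otimes \sigma_2 \otimes \ldots \otimes \sigma_k) \\
& + D(\sigma'_1 \otimes \sigma_2 \otimes \ldots \otimes \sigma_k, \sigma'_1 \otimes \sigma'_2 \otimes \sigma_3 \otimes \ldots \otimes \sigma_k) + \ldots \\
& + D(\sigma'_1 \otimes \ldots \otimes \sigma'_{k-1} \otimes \sigma_k, \sigma'_1 \otimes \ldots \otimes \sigma'_k) \\
={} & D(\sigma_1, \sigma'_1) + D(\sigma_2, \sigma'_2) + \ldots + D(\sigma_k, \sigma'_k).
\end{aligned}$$

■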
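
As a sanity check of this bound, here is a small classical sketch. It covers only the special case of diagonal density matrices, which is an assumption of the sketch and not the general setting of the claim: for diagonal states the trace distance equals the total variation distance between the diagonals. The function names are hypothetical.

```python
from itertools import product


def tv_distance(p, q):
    """Trace distance between diag(p) and diag(q): half the l1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def tensor(dists):
    """Diagonal of the tensor product of the diagonal states in dists."""
    out = []
    for idx in product(*(range(len(d)) for d in dists)):
        prob = 1.0
        for d, i in zip(dists, idx):
            prob *= d[i]
        out.append(prob)
    return out


def subadditivity_gap(sigmas, sigmas_prime):
    """Right-hand side minus left-hand side of the claim (nonnegative)."""
    lhs = tv_distance(tensor(sigmas), tensor(sigmas_prime))
    rhs = sum(tv_distance(s, t) for s, t in zip(sigmas, sigmas_prime))
    return rhs - lhs
```

With `sigmas = [[0.5, 0.5], [0.9, 0.1], [0.3, 0.7]]` and `sigmas_prime = [[0.6, 0.4], [0.8, 0.2], [0.3, 0.7]]` the left-hand side is about 0.13 while the sum of the individual distances is 0.2, so the inequality holds with room to spare.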

In addition, we will need the following lemma: 
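Lemma 3.11 For any $1 \le R \le 2^{\mathrm{poly}(n)}$, let

$$|\eta\rangle = \frac{1}{\sqrt{|\frac{1}{L}\mathbb{Z}^n \cap B_n|}} \sum_{x \in \frac{1}{L}\mathbb{Z}^n \cap B_n} |x\rangle$$

be the uniform superposition on grid points inside a ball of radius $R$ around the origin where $L = 2^n$.
Then, for any $c > 0$, a state $|\tilde{\eta}\rangle$ whose trace distance from $|\eta\rangle$ is at most $1/n^c$ can be efficiently
computed.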






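Proof: In order to bound the trace distance, we will use the fact that for any two pure states
$|\psi_1\rangle, |\psi_2\rangle$,

$$D(|\psi_1\rangle, |\psi_2\rangle) = \sqrt{1 - |\langle \psi_1 | \psi_2 \rangle|^2} \le \| |\psi_1\rangle - |\psi_2\rangle \|_2. \qquad (1)$$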

The first equality appears in j^J and the inequality follows by a simple calculation. 

Consider the (continuous) uniform probability distribution $q$ over $B_n$. Then one can define its
discretization $q'$ to the grid $\frac{1}{L}\mathbb{Z}^n$ as

$$q'(x) = \int_{x + [0, 1/L]^n} q(y)\,dy$$

for $x \in \frac{1}{L}\mathbb{Z}^n$. In other words, $q'(x)$ is proportional to the volume of the intersection of $B_n$ with the
cube $x + [0, 1/L]^n$. Notice that for points $x$ such that $x + [0, 1/L]^n$ is completely contained in $B_n$,
$q'(x) = 1/(L^n \mathrm{vol}(B_n))$. We claim that the state

$$|\eta'\rangle = \sum_{x \in \frac{1}{L}\mathbb{Z}^n} \sqrt{q'(x)}\, |x\rangle$$

is exponentially close to $|\eta\rangle$. Intuitively, this holds since the two differ only on points which are
very close to the boundary of the ball, namely, of distance $\sqrt{n}/L$ from the boundary. The number
of such points is negligible compared to the number of points in the interior of the ball. More
formally, define

$$|\eta''\rangle = \sqrt{\frac{L^n \mathrm{vol}(B_n)}{|\frac{1}{L}\mathbb{Z}^n \cap B_n|}}\, |\eta'\rangle.$$

Using Equation (1),

$$D(|\eta'\rangle, |\eta\rangle) \le \| |\eta'\rangle - |\eta\rangle \|_2 \le \| |\eta'\rangle - |\eta''\rangle \|_2 + \| |\eta''\rangle - |\eta\rangle \|_2.$$

The first term is at most $2^{-\Omega(n)}$ according to Claim 3.8. For the second term, notice that the
amplitudes of $|\eta''\rangle$ and $|\eta\rangle$ are the same except possibly on points $x$ of distance $\sqrt{n}/L$ from the
boundary. Using Claim 3.8 again we get that the fraction of such points is closely approximated
by one minus the ratio of volumes of the ball of radius $R - \sqrt{n}/L$ and the ball of radius $R$. This
ratio of volumes is $(1 - \sqrt{n}/(RL))^n \ge (1 - \sqrt{n}/L)^n \ge 1 - n^{1.5}/L = 1 - 2^{-\Omega(n)}$.

In the following we show how to approximate the state $|\eta'\rangle$. This idea is essentially due to Grover and Rudolph. Let $m \in \mathbb{Z}$ be large enough so that $B_n$ is contained in the cube $[-2^m, 2^m]^n$. Using our assumption on $R$, $m < n^{c_1}$ for some $c_1 > 1$. We represent $x$ using $K = n(m+1+\log L) < 2n^{1+c_1}$ qubits, i.e., a block of $m + 1 + \log L$ qubits for each dimension. Hence, we can write $|\eta'\rangle$ as

\[ |\eta'\rangle = \sum_{x_1,\ldots,x_K \in \{0,1\}} \sqrt{q'(x_1,\ldots,x_K)}\;|x_1,\ldots,x_K\rangle. \]

We now show an equivalent way of writing $|\eta'\rangle$. Let us extend the definition of $q'$ in the following way: for any $k \le K$ and any $x_1,\ldots,x_k \in \{0,1\}$ define $q'(x_1,\ldots,x_k)$ as the sum of $q'(x_1,\ldots,x_k,x_{k+1},\ldots,x_K)$ over all sequences $x_{k+1},\ldots,x_K \in \{0,1\}$. Notice that $q'(x_1,\ldots,x_k)$ corresponds to the volume of the intersection of $B_n$ with a certain cuboid (also known as a rectangular parallelepiped). For example, $q'(0) = q'(1) = \frac{1}{2}$ since they represent the intersection of $B_n$ with two halves of the cube $[-2^m, 2^m]^n$. Using the definition $s(x_1) = q'(x_1)$ and, for $k > 1$, $s(x_1,\ldots,x_k) = q'(x_1,\ldots,x_k)/q'(x_1,\ldots,x_{k-1})$ we see that

\[ |\eta'\rangle = \sum_{x_1 \in \{0,1\}} \sqrt{s(x_1)} \sum_{x_2 \in \{0,1\}} \sqrt{s(x_1,x_2)} \cdots \sum_{x_K \in \{0,1\}} \sqrt{s(x_1,\ldots,x_K)}\;|x_1,\ldots,x_K\rangle. \]

The algorithm starts with all $K$ qubits in the state $|0\rangle$ and sets one qubit at a time. The first qubit is rotated to the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Assume we are now in the $k$'th step after setting the state of qubits $1,\ldots,k-1$. We use the fact that there exists a classical algorithm for approximating the volume of a convex body up to any $1/\mathrm{poly}(n)$ error (see [15] and references therein). The body should be provided by a "well-guaranteed weak membership oracle", i.e., a sphere containing the body, a sphere contained in the body, both of non-zero radius, and an oracle that given a point decides if it is inside the body or not. It is easy to construct two such spheres and an oracle for a body given by the intersection of a ball with a cuboid. Hence, we can compute two values $\tilde{s}(x_1,\ldots,x_{k-1},0)$ and $\tilde{s}(x_1,\ldots,x_{k-1},1)$ such that

\[ \tilde{s}(x_1,\ldots,x_{k-1},0) + \tilde{s}(x_1,\ldots,x_{k-1},1) = 1 \]

and

\[ \left| \frac{\tilde{s}(x_1,\ldots,x_{k-1},i)}{s(x_1,\ldots,x_{k-1},i)} - 1 \right| < n^{-c_2} \]

for $i = 0, 1$ and some constant $c_2$ which will be chosen later. Then, we rotate the $k$'th qubit to the state $\sqrt{\tilde{s}(x_1,\ldots,x_{k-1},0)}\,|0\rangle + \sqrt{\tilde{s}(x_1,\ldots,x_{k-1},1)}\,|1\rangle$. This completes the description of the procedure.
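The qubit-by-qubit construction can be illustrated classically. The sketch below is an assumption-laden toy: it works with an explicit list `q` of $2^K$ cell masses (index bits ordered with $x_1$ most significant) and uses exact prefix sums in place of the approximate volume computation, so it only demonstrates that the product of the per-qubit rotation amplitudes $\sqrt{s(\cdot)}$ reproduces $\sqrt{q'(x)}$. The function names are hypothetical.

```python
import math
from itertools import product

def prefix_mass(q, prefix, K):
    """q'(x_1..x_k): total mass of all K-bit strings that extend `prefix`."""
    start = 0
    for bit in prefix:
        start = 2 * start + bit
    width = K - len(prefix)
    start <<= width
    return sum(q[start:start + (1 << width)])

def amplitudes_by_rotation(q, K):
    """Amplitude of every K-bit basis state when qubit k is rotated to
    sqrt(s(..,0))|0> + sqrt(s(..,1))|1>, with s the conditional mass."""
    amps = []
    for bits in product((0, 1), repeat=K):
        amp = 1.0
        for k in range(1, K + 1):
            parent = prefix_mass(q, bits[:k - 1], K)
            child = prefix_mass(q, bits[:k], K)
            amp *= math.sqrt(child / parent) if parent > 0 else 0.0
        amps.append(amp)
    return amps

# Toy one-dimensional "ball": boundary cells carry less mass than interior ones.
q = [0.0, 0.1, 0.2, 0.2, 0.2, 0.2, 0.1, 0.0]
amps = amplitudes_by_rotation(q, 3)
```

In the actual procedure the conditional masses are only approximated, which is what the error analysis that follows accounts for.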

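Notice that the amplitude of each basis state $|x_1,\ldots,x_K\rangle$ in the resulting state $|\tilde\eta\rangle$ is given by

\[ \prod_{k=1}^{K} \sqrt{\tilde{s}(x_1,\ldots,x_k)} \ge (1 - n^{-c_2})^K \prod_{k=1}^{K} \sqrt{s(x_1,\ldots,x_k)}. \]

Hence the inner product $\langle\tilde\eta|\eta'\rangle$ is at least

\[ (1 - n^{-c_2})^K \sum_{x_1,\ldots,x_K \in \{0,1\}} \prod_{k=1}^{K} s(x_1,\ldots,x_k) = (1 - n^{-c_2})^K \sum_{x_1,\ldots,x_K \in \{0,1\}} q'(x_1,\ldots,x_K) = (1 - n^{-c_2})^K \ge 1 - K \cdot n^{-c_2} \ge 1 - 2n^{1+c_1-c_2}. \]

Using Equation (1),

\[ D(|\eta'\rangle, |\tilde\eta\rangle) = \sqrt{1 - |\langle\tilde\eta|\eta'\rangle|^2} \le n^{-c} \]

for a large enough $c_2$. ■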

Let $p > n^{2+2f}$ be any fixed prime. The following is the main lemma of this section. It essentially replaces Lemma 3.4 and hence implies Theorem 1.1.

Lemma 3.12 For any $f > 0$, let $u = \sum_{i=1}^{n} u_i b_i$ be the shortest lattice vector in a $(c_{unq} n^{\frac{1}{2}+2f})$-unique lattice where $c_{unq} > 0$ is a constant. If there exists a solution to the two point problem with failure parameter $f$ then there exists a quantum algorithm that given this lattice and three integers $l, m, i_0$ returns $(u_1, \ldots, u_{i_0-1}, \frac{u_{i_0}-m}{p}, u_{i_0+1}, \ldots, u_n)$ with probability $1/\mathrm{poly}(n)$ if the following conditions hold: $\|u\| \le l \le 2\|u\|$, $u_{i_0} \equiv m \pmod{p}$ and $1 \le m \le p - 1$.

Proof: As before, let $(b_1, \ldots, b_n)$ be an LLL reduced basis, let $M = 2^{4n}$ and assume that $i_0 = 1$. We also define $f(t,a)$ as before. Assume that the number of registers needed by the two point algorithm is at most $n^{c_1}$ for some constant $c_1 > 0$.

The algorithm starts by calling the routine of Lemma 3.11 $n^{c_1}$ times with accuracy parameter $n^{-c_2}$ and $R = c_{bal} n^{\frac{1}{2}+2f} \cdot l$ for some constants $c_2, c_{bal} > 0$. The state we obtain is

\[ |\tilde\eta_1\rangle \otimes \cdots \otimes |\tilde\eta_{n^{c_1}}\rangle \qquad (2) \]
where each $|\tilde\eta_i\rangle$ has a trace distance of at most $n^{-c_2}$ from $|\eta\rangle$. According to Claim 3.10, the above tensor product has a trace distance of at most $n^{c_1-c_2}$ from $|\eta\rangle^{\otimes n^{c_1}}$. In the following we show that the algorithm succeeds with probability at least $n^{-c_3}$ for some $c_3 > 0$ given the state $|\eta\rangle^{\otimes n^{c_1}}$. This would complete the proof since given the state in Equation (2) the algorithm succeeds with probability at least $n^{-c_3} - n^{c_1-c_2} > \frac{1}{2}n^{-c_3}$ for large enough $c_2$.

We describe a routine that given the state $|\eta\rangle$ creates one register in the input to the two point problem. In order to produce a complete input to the two point problem, the algorithm calls this routine $n^{c_1}$ times, each time with a new $|\eta\rangle$ register. It then calls the two point algorithm and outputs the result. As required, the success probability is $1/\mathrm{poly}(n \log M) = n^{-c_3}$ for some $c_3 > 0$.

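Given $|\eta\rangle$, the routine creates the state

\[ \frac{1}{\sqrt{2M^n}} \sum_{t \in \{0,1\},\, a \in A} |t, a\rangle \otimes |\eta\rangle \]

or equivalently, up to normalization,

\[ \sum_{t \in \{0,1\},\, a \in A,\, x \in \frac{1}{L}\mathbb{Z}^n \cap B_n} |t, a, x\rangle \]

where $B_n$ is the ball of radius $R$ around the origin and $L = 2^n$. We add the value $f(t,a)$ to the last register,

\[ \sum_{t \in \{0,1\},\, a \in A,\, x \in \frac{1}{L}\mathbb{Z}^n \cap B_n} |t, a, f(t,a) + x\rangle. \]

Finally, we measure the last register and if $x'$ denotes the result, the state collapses to

\[ \sum_{t \in \{0,1\},\, a \in A \;\mid\; x' \in f(t,a) + \frac{1}{L}\mathbb{Z}^n \cap B_n} |t, a, x'\rangle. \]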

Claim 3.13 For every $x'$, there is at most one element of the form $(0, a)$ and at most one element of the form $(1, a')$ such that $x' \in f(t,a) + \frac{1}{L}\mathbb{Z}^n \cap B_n$. Moreover, if there are two such elements $(0, a)$ and $(1, a')$ then $a' - a$ is the vector $(\frac{u_1 - m}{p}, u_2, \ldots, u_n)$.

Proof: Consider two different lattice points in the image of $f$, $v = f(t,a)$ and $v' = f(t',a')$, such that $x'$ is both in $v + \frac{1}{L}\mathbb{Z}^n \cap B_n$ and $v' + \frac{1}{L}\mathbb{Z}^n \cap B_n$. This implies that $\|v' - v\| \le c_{bal} n^{\frac{1}{2}+2f} \cdot l \le 2c_{bal} n^{\frac{1}{2}+2f} \cdot \|u\|$. For $c_{unq} > 2c_{bal}$ this means that $v' - v = k \cdot u$ for some integer $k \ne 0$. As before, by considering the first coordinate of $v' - v$ in the lattice basis we get that $(a'_1 p + t'm) - (a_1 p + tm) \equiv k \cdot m \pmod{p}$. Hence, $k \equiv t' - t \pmod{p}$. If $t = t'$ then $k \equiv 0 \pmod{p}$ and therefore $|k| \ge p$, which contradicts the above upper bound on the distance between $v$ and $v'$. This proves the first part of the claim. For the second part, let $t = 0$ and $t' = 1$. Then, $k \equiv 1 \pmod{p}$. As before, this can only happen when $k = 1$ and hence the second part of the claim holds. ■

Notice that the probability of measuring $x'$ is the same as that obtained by first choosing random $t$ and $a$ and then choosing a random point in $f(t,a) + \frac{1}{L}\mathbb{Z}^n \cap B_n$. Let us define for any $t$ and $a$ the vector $a'$ as before.

Claim 3.14 With probability at least $1 - \frac{1}{(n\log(2M))^f}$, for randomly chosen $t$ and $a$ and a random point $x'$ in $f(t,a) + \frac{1}{L}\mathbb{Z}^n \cap B_n$, $a'$ is in $A$ and $x'$ is also in $f(1-t, a') + \frac{1}{L}\mathbb{Z}^n \cap B_n$.
Proof: According to Lemma 3.3, $|u_i| < 2^{2n}$. Hence, unless there exists an $i$ for which $a_i < 2^{2n}$ or $a_i \ge M - 2^{2n}$, $a'$ is guaranteed to be in $A$. This happens with probability at most $n2^{2n+1}/M$ because $a$ is a random element of $A$.

Fix $a, a' \in A$. We would like to show that if $x'$ is chosen uniformly from $f(t,a) + \frac{1}{L}\mathbb{Z}^n \cap B_n$ then with high probability it is also in $f(1-t,a') + \frac{1}{L}\mathbb{Z}^n \cap B_n$. By translating both sets by $-f(t,a)$ we get the equivalent statement that if $x'$ is chosen uniformly from $\frac{1}{L}\mathbb{Z}^n \cap B_n$ then with high probability it is also in $(f(1-t,a') - f(t,a)) + \frac{1}{L}\mathbb{Z}^n \cap B_n$. Since we assumed that our lattice is a subset of $\mathbb{Z}^n$, $f(1-t,a') - f(t,a) \in \mathbb{Z}^n$ and the latter set equals $\frac{1}{L}\mathbb{Z}^n \cap (f(1-t,a') - f(t,a) + B_n)$. Using Corollary 3.9 and the fact that $\|f(1-t,a') - f(t,a)\| = \|u\| \le l$, we get that the required probability is at least

\[ 1 - O(\sqrt{n}\,l/R) = 1 - O\big(\sqrt{n}\,l/(c_{bal} n^{\frac{1}{2}+2f} \cdot l)\big) = 1 - O\big(1/(c_{bal} n^{2f})\big). \]

The sum of the two error probabilities $\frac{n2^{2n+1}}{M} + O(1/(c_{bal} n^{2f}))$ is at most $\frac{1}{(n\log(2M))^f}$ for $c_{bal}$ large enough. ■

This concludes the proof of Lemma 3.12. ■



4 The Dihedral Coset Problem 

We begin this section with a description of the average case subset sum problem. We describe our 
assumptions on the subroutine that solves it and prove some properties of such a subroutine. In 
the second subsection we present an algorithm that solves the DCP with calls to an average case 
subset sum subroutine. 

4.1 Subset Sum 

The subset sum problem is defined as follows. An input is a sequence of numbers $A = (a_1, \ldots, a_r)$ and two numbers $t, N$. The output is a subset $B \subseteq [r]$ such that $\sum_{i \in B} a_i \equiv t \pmod{N}$. Let a legal input be an input for which there exists a subset $B$ with $\sum_{i \in B} a_i \equiv t \pmod{N}$. For a constant $c_r > 0$, we fix $r$ to be $\log N + c_r$ since we will only be interested in such instances. First we show that there are many legal inputs:

Lemma 4.1 For randomly chosen $a_1, \ldots, a_r, t$ in $\{0, \ldots, N-1\}$, the probability that there is no $B \subseteq [r]$ such that $\sum_{i \in B} a_i \equiv t \pmod{N}$ is at most $\frac{1}{2}$.

Proof: Fix a value of $t$. Define a random variable $X_b$ for every $b \in \{0,1\}^r$, $b \ne 0^r$, as 1 if $\sum_i b_i a_i \equiv t \pmod{N}$ and 0 otherwise. Since for every $b$ the sum $\sum_i b_i a_i$ has any value modulo $N$ with the same probability, the expectation of $X_b$ is $\frac{1}{N}$ and its variance is $\frac{1}{N} - \frac{1}{N^2} < \frac{1}{N}$. Hence,

\[ E\Big[\sum_b X_b\Big] = \frac{2^r - 1}{N}. \]

Given two different sequences $b, b' \in \{0,1\}^r$ we show that $X_b$ and $X_{b'}$ are independent. Let $i$ be such that $b_i \ne b'_i$ and assume without loss of generality that $b_i = 1$, $b'_i = 0$ and $i = 1$. Then,

\[ \begin{aligned} \Pr_{a_1,\ldots,a_r}[X_b = 1 \wedge X_{b'} = 1] &= E_{a_2,\ldots,a_r}\big[\Pr_{a_1}[X_b = 1 \wedge X_{b'} = 1]\big] \\ &= E_{a_2,\ldots,a_r}\big[\tfrac{1}{N} \cdot X_{b'}\big] \\ &= \Pr_{a_1,\ldots,a_r}[X_b = 1]\,\Pr_{a_1,\ldots,a_r}[X_{b'} = 1] \end{aligned} \]

where the second equality holds because $X_{b'}$ does not depend on $a_1$ and $X_b$ is 1 with probability $1/N$ for any $a_2, \ldots, a_r$. A similar argument holds for other values of $X_b$ and $X_{b'}$. Therefore, the random variables are pairwise independent and by the Chebyshev bound,

\[ \Pr\left[\,\Big|\sum_b X_b - \frac{2^r - 1}{N}\Big| \ge \frac{2^r - 1}{2N}\right] \le \frac{4N}{2^r - 1} \le \frac{8}{2^{c_r}}. \]

In particular, the probability of $\sum_b X_b = 0$, that is, the probability that there is no $B$ such that $\sum_{i \in B} a_i \equiv t \pmod{N}$, is at most $\frac{8}{2^{c_r}} = \frac{1}{2}$ for $c_r = 4$. ■
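The lemma can be probed empirically on small instances. The following is a minimal sketch, assuming $N$ is a power of two so that $r = \log_2 N + c_r$ is an integer; the function names, trial count and seed are illustrative. Unlike the proof, the sketch also counts the empty subset, which can only make more inputs legal.

```python
import random

def reachable_sums(a, N):
    """All values of sum_{i in B} a_i mod N over subsets B (empty set included)."""
    sums = {0}
    for x in a:
        sums |= {(s + x) % N for s in sums}
    return sums

def estimate_illegal_fraction(N, c_r=4, trials=2000, seed=0):
    """Monte Carlo estimate of the fraction of random inputs (A, t) with no solution."""
    rng = random.Random(seed)
    r = N.bit_length() - 1 + c_r  # log2(N) + c_r when N is a power of two
    illegal = 0
    for _ in range(trials):
        a = [rng.randrange(N) for _ in range(r)]
        t = rng.randrange(N)
        if t not in reachable_sums(a, N):
            illegal += 1
    return illegal / trials

print(estimate_illegal_fraction(64))
```

For such small parameters the observed fraction is typically far below the bound of the lemma, which is not tight.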

We assume that we are given a subroutine that answers a $\frac{1}{\log^{c_s} N}$ fraction of the legal subset sum inputs with parameter $N$ where $c_s > 0$ is any constant. As can be seen from the previous lemma, this implies that the subroutine answers a non-negligible fraction of all inputs (and not just the legal inputs). In addition, we assume that the subroutine is deterministic. We denote by $S(A, t)$ the result of the subroutine $S$ on the input $A = (a_1, \ldots, a_r), t$ and we omit $N$. This result can either be a set or an error. Let $S(A)$ denote the set of $t$'s for which the subroutine returns a set and not an error, i.e., $S(A) = \{t \mid S(A,t) \ne \text{error}\}$.

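Corollary 4.2 For randomly chosen $a_1, \ldots, a_r$ in $\{0, \ldots, N-1\}$, $\Pr_A\big[|S(A)| \ge \frac{N}{4\log^{c_s} N}\big] = \Omega\big(\frac{1}{\log^{c_s} N}\big)$ where $A = (a_1, \ldots, a_r)$.

Proof: Since $S(A,t) \ne \text{error}$ only when $(A,t)$ is a legal input,

\[ \begin{aligned} \Pr_{A,t}[S(A,t) \ne \text{error}] &= \Pr_{A,t}[S(A,t) \ne \text{error} \wedge (A,t) \text{ is legal}] \\ &= \Pr_{A,t}[S(A,t) \ne \text{error} \mid (A,t) \text{ is legal}] \cdot \Pr_{A,t}[(A,t) \text{ is legal}] \ge \frac{1}{2\log^{c_s} N}. \end{aligned} \]

In addition,

\[ \begin{aligned} \Pr_{A,t}[S(A,t) \ne \text{error}] &= E_A\Big[\frac{|S(A)|}{N}\Big] \\ &\le \Pr_A\Big[|S(A)| \ge \frac{N}{4\log^{c_s} N}\Big] + \Pr_A\Big[|S(A)| < \frac{N}{4\log^{c_s} N}\Big] \cdot \frac{1}{4\log^{c_s} N} \\ &\le \Pr_A\Big[|S(A)| \ge \frac{N}{4\log^{c_s} N}\Big] + \frac{1}{4\log^{c_s} N}. \end{aligned} \]

By combining the two inequalities we obtain the corollary. ■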

Lemma 4.3 Let T C {0, ... ,N — 1} be a set such that \T\ > -j /or a certain s. Then, for any 
q < there exists q' G {q, 2q, . . . , sq} such that the number of pairs t,t + q' that are both in T is 

Proof: Define the partition of T into sets To, ... , T q ^i as 

Tk = {i \ i £ T,i = k (mod q)}. 

At least ^ of the sets are of size at least ^ since their union is T and + ^ < \T\. Let Tj be 
such a set and for t G Tj consider the values t + g, i + 2g, . . . , t + 4sg. Therefore, the number of t G Tj 
such that none of these values is in T{ is less than ^ because \{i | < i < N, i = k (mod q)}\ = 
Therefore, more than |Tj| — ^ > ^ of the elements t G Tj are such that one of t+q,t+2q, . . . , t+4sg 
is also in Tj. Summing over all sets Tj such that |Tj| > there are at least • ^ = ^ elements 
i G T for which one oft + q,t + 2q,...,t + 4sq is also in T. Thus, there exists a q' G {g, 2q, . . . , 4sq} 
such that the number of £ G T for which t + (/ G T is at least ■ 
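The quantity bounded in Lemma 4.3 is easy to compute for a concrete set. The sketch below is a brute-force illustration only: for each multiple $q'$ of $q$ it counts the pairs $t, t+q'$ that both lie in $T$ and returns the multiple with the most pairs. The function names and the `max_multiple` parameter (how many multiples of $q$ to scan) are hypothetical.

```python
def pair_count(T, q_prime):
    """Number of t such that both t and t + q_prime are in T."""
    members = set(T)
    return sum(1 for t in members if t + q_prime in members)

def best_multiple(T, q, max_multiple):
    """Multiple q' in {q, 2q, ..., max_multiple*q} with the most pairs in T.

    Ties are resolved in favour of the smallest multiple.
    """
    candidates = [j * q for j in range(1, max_multiple + 1)]
    return max(candidates, key=lambda qp: pair_count(T, qp))

T = {0, 5, 6, 12, 18}
print(best_multiple(T, q=3, max_multiple=3))
```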
Definition 4.4 A partial function $f : \{0, \ldots, N-1\} \to \{0, \ldots, N-1\}$ is called a matching if for all $i$ such that $f(i)$ is defined, $f(i) \ne i$ and $f(f(i)) = i$. A matching is a $q$-matching if for all $i$ such that $f(i)$ is defined, $|f(i) - i| = q$. We define an equal partition of the domain of a matching $f$ by $A_1(f) = \{i \mid f(i) \text{ defined} \wedge f(i) > i\}$ and $A_2(f) = \{i \mid f(i) \text{ defined} \wedge f(i) < i\}$. The intersection of a matching $f$ and a set $T \subseteq \{0, \ldots, N-1\}$ is the set $\{i \mid i \in T \wedge f(i) \in T\}$.

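For any $q$ we define the following $q$-matchings:

\[ f_q^{<}(t) = \begin{cases} t + q & t \bmod 2q < q,\ t + q < N, \\ t - q & t \bmod 2q \ge q,\ t - q \ge 0, \\ \text{undefined} & \text{otherwise,} \end{cases} \qquad f_q^{\ge}(t) = \begin{cases} t - q & t \bmod 2q < q,\ t - q \ge 0, \\ t + q & t \bmod 2q \ge q,\ t + q < N, \\ \text{undefined} & \text{otherwise.} \end{cases} \]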
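The two $q$-matchings defined above can be written down directly. The following sketch uses the hypothetical names `first_matching` and `second_matching`, returns `None` where the function is undefined, and checks the matching property of Definition 4.4 by brute force.

```python
from typing import Callable, Optional

def first_matching(t: int, q: int, N: int) -> Optional[int]:
    """Pairs t with t+q when t mod 2q < q, and with t-q otherwise."""
    if t % (2 * q) < q:
        return t + q if t + q < N else None
    return t - q if t - q >= 0 else None

def second_matching(t: int, q: int, N: int) -> Optional[int]:
    """Pairs t with t-q when t mod 2q < q, and with t+q otherwise."""
    if t % (2 * q) < q:
        return t - q if t - q >= 0 else None
    return t + q if t + q < N else None

def is_q_matching(f: Callable[[int, int, int], Optional[int]], q: int, N: int) -> bool:
    """Check |f(t) - t| = q and f(f(t)) = t wherever f is defined."""
    for t in range(N):
        u = f(t, q, N)
        if u is None:
            continue
        if abs(u - t) != q or f(u, q, N) != t:
            return False
    return True
```

Every pair $t, t+q$ with $t + q < N$ is matched by exactly one of the two functions, which is why it suffices to consider both of them for each $q'$.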

Lemma 4.5 There exists a constant c m such that for any integer q < log ^, N there exists a matching 
f among the 2 log Cm N matchings f£ q , . . . , /^ gCm Nq , f q , f$ q , . . . , /, 2 og c m Nq such that with probability 
at least log in N on the choice of A, the intersection of f and S(A) is log ^ N ■ We call such an f a 
good matching. 

Proof: According to Corollary 4.2, a $\frac{1}{4\log^{c_s} N}$ fraction of the possible values of $A$ satisfy $|S(A)| \ge \frac{N}{4\log^{c_s} N}$. For such $A$, Lemma 4.3 with $s = 4\log^{c_s} N$ implies that there exists a value $q' \in \{q, 2q, \ldots, 4\log^{c_s} N \cdot q\}$ such that the number of pairs $t, t + q'$ that are both in $S(A)$ is $\Omega\big(\frac{N}{\log^{3c_s} N}\big)$. Therefore, for such $A$ and $q'$, the size of the intersection of one of the matchings $f_{q'}^{<}, f_{q'}^{\ge}$ and $S(A)$ is $\Omega\big(\frac{N}{\log^{3c_s} N}\big)$. This implies that one of the $8\log^{c_s} N$ matchings considered must have an intersection of size $\Omega\big(\frac{N}{\log^{3c_s} N}\big)$ with at least $\frac{1}{32\log^{2c_s} N}$ of the possible values of $A$. We conclude the proof by choosing $c_m > 3c_s$. ■

4.2 The Quantum Algorithm 

We begin with the following simple claim: 

Claim 4.6 For any two basis states $|a\rangle$ and $|b\rangle$, $a \ne b$, there exists a routine such that given the state $|a\rangle + e(\phi)|b\rangle$ outputs the state $|0\rangle + e(\phi)|1\rangle$.

Proof: Consider the function $f$ defined as $f(a) = 0$, $f(0) = a$, $f(b) = 1$, $f(1) = b$ and $f(i) = i$ otherwise. It is reversible and can therefore be implemented as a quantum routine. ■
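As a classical illustration of the relabeling, the sketch below builds a permutation sending $a$ to 0 and $b$ to 1. Read literally, the four defining equations of $f$ can conflict when $\{a,b\}$ overlaps $\{0,1\}$ (for instance $a = 1$, $b = 2$ would require both $f(1) = 0$ and $f(1) = 2$), so this sketch composes two transpositions instead; that choice, and the function name, are assumptions of the example rather than the construction used in the proof.

```python
def relabel_permutation(a: int, b: int, size: int):
    """Return a permutation p of range(size) with p[a] == 0 and p[b] == 1."""
    if a == b:
        raise ValueError("a and b must be distinct")
    p = list(range(size))

    def swap_values(u: int, v: int) -> None:
        # Compose p with the transposition exchanging output labels u and v.
        for i in range(size):
            if p[i] == u:
                p[i] = v
            elif p[i] == v:
                p[i] = u

    swap_values(a, 0)     # afterwards p[a] == 0
    swap_values(p[b], 1)  # afterwards p[b] == 1; p[a] is untouched since p[b] != 0
    return p

print(relabel_permutation(1, 2, 4))
```

Being a permutation of basis labels, the map is reversible, which is the only property the claim needs.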

We now describe the main routine in the DCP algorithm. 

Lemma 4.7 There exist routines $R_1, R_2$ such that given a $q$-matching $f$ and an input for the DCP with failure parameter 1, they either output a bit or they fail. Conditioned on non-failure, the probability of the bit being 1 is $\frac{1}{2} - \frac{1}{2}\cos(2\pi q\frac{d}{N})$ for $R_1$ and $\frac{1}{2} + \frac{1}{2}\sin(2\pi q\frac{d}{N})$ for $R_2$. Moreover, if $f$ is a good matching, the success probability is $\Omega\big(\frac{1}{\log^{c_m} N}\big)$.

Proof: The routines begin by performing a Fourier transform on the last $\log N$ qubits of each input register. Consider one register. Assuming it is a good register, the resulting state is

\[ \frac{1}{\sqrt{2N}} \sum_{i=0}^{N-1} e(ix/N)\,|0, i\rangle + \frac{1}{\sqrt{2N}} \sum_{i=0}^{N-1} e(i(x+d)/N)\,|1, i\rangle = \frac{1}{\sqrt{2N}} \sum_{i=0}^{N-1} e(ix/N)\,\big(|0\rangle + e(id/N)|1\rangle\big)|i\rangle. \]

We measure the last $\log N$ qubits and let $a \in \{0, \ldots, N-1\}$ be the result. The state collapses to

\[ \frac{1}{\sqrt{2}}\, e(ax/N)\,\big(|0\rangle + e(ad/N)|1\rangle\big)|a\rangle. \]

If it is a bad register, it is in the state $|b, x\rangle$ where both $b$ and $x$ are arbitrary. After the Fourier transform the state is $\frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} e(ix/N)\,|b, i\rangle$ and after measuring $a$ in the last $\log N$ qubits, the state is $e(ax/N)\,|b, a\rangle$. Notice that in both cases any value $a$ in $\{0, \ldots, N-1\}$ has an equal probability of being measured.

We choose the number of input registers to be r. Let A = (a±, . . . , a r ) be the sequence of values 
measured in the above process. Notice that this sequence is uniform and hence can be used as 
an input to the average case subset sum algorithm. In the following, we assume that s of the r 
registers are bad. Later we will claim that with good probability, none of the registers is bad. Yet, 
we have to show that even if one of the registers is bad, the routine does not return erroneous 
results. Without loss of generality, assume that the first s registers are bad. The resulting state is: 

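\[ \bigotimes_{i=1}^{s} \big[e(a_i x_i/N)\,|b_i, a_i\rangle\big] \otimes \bigotimes_{i=s+1}^{r} \Big[\frac{1}{\sqrt{2}}\, e(a_i x_i/N)\,\big(|0\rangle + e(a_i d/N)|1\rangle\big)|a_i\rangle\Big]. \]

Or, by omitting the multiplication by the fixed phase and the $r \cdot \lceil \log N \rceil$ fixed qubits,

\[ \bigotimes_{i=1}^{s} |b_i\rangle \otimes \bigotimes_{i=s+1}^{r} \Big[\frac{1}{\sqrt{2}}\big(|0\rangle + e(a_i d/N)|1\rangle\big)\Big]. \]

Denote these $r$ qubits by $\bar\alpha = (\bar\alpha_1, \ldots, \bar\alpha_r)$.

We add $r + 1$ new qubits, $\bar\beta = (\bar\beta_1, \ldots, \bar\beta_r)$ and $\gamma$. Let $t_{\bar\alpha}$ denote the sum $\sum_{i=1}^{r} \bar\alpha_i a_i$. Next, we perform the following operations: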



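    if $S(A, t_{\bar\alpha}) \ne \bar\alpha$ or $S(A, f(t_{\bar\alpha})) = \text{error}$
        then exit
    if $t_{\bar\alpha} \in A_1(f)$
        then $\bar\beta \leftarrow \bar\alpha$, $\gamma \leftarrow 1$
    else if $t_{\bar\alpha} \in A_2(f)$
        then $\bar\beta \leftarrow S(A, f(t_{\bar\alpha}))$, $\gamma \leftarrow 1$
    else exit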



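In order to describe the state after the above procedure, we define the following subsets of $\{0,1\}^r$:

\[ \begin{aligned} M &= \{\bar\alpha \in \{0,1\}^r \mid \bar\alpha_1 = b_1, \ldots, \bar\alpha_s = b_s\} \\ L &= \{\bar\alpha \in M \mid t_{\bar\alpha} \in A_1(f) \wedge S(A, t_{\bar\alpha}) = \bar\alpha \wedge S(A, f(t_{\bar\alpha})) \ne \text{error}\} \\ R &= \{\bar\alpha \in M \mid t_{\bar\alpha} \in A_2(f) \wedge S(A, t_{\bar\alpha}) = \bar\alpha \wedge S(A, f(t_{\bar\alpha})) \ne \text{error}\} \end{aligned} \]

Using the order $|\bar\alpha, \bar\beta, \gamma\rangle$, and writing $\langle\bar\alpha, \bar{a}\rangle = \sum_i \bar\alpha_i a_i$, the resulting state is:

\[ \begin{aligned} &\frac{1}{\sqrt{2^{r-s}}} \Big( \sum_{\bar\alpha \in M - L - R} e\big(\langle\bar\alpha, \bar{a}\rangle \tfrac{d}{N}\big)|\bar\alpha, 0, 0\rangle + \sum_{\bar\alpha \in L} e\big(\langle\bar\alpha, \bar{a}\rangle \tfrac{d}{N}\big)|\bar\alpha, \bar\alpha, 1\rangle + \sum_{\bar\alpha \in R} e\big(\langle\bar\alpha, \bar{a}\rangle \tfrac{d}{N}\big)|\bar\alpha, S(A, f(t_{\bar\alpha})), 1\rangle \Big) \\ &= \frac{1}{\sqrt{2^{r-s}}} \Big( \sum_{\bar\alpha \in M - L - R} e\big(\langle\bar\alpha, \bar{a}\rangle \tfrac{d}{N}\big)|\bar\alpha, 0, 0\rangle + \sum_{\bar\alpha \in L} e\big(\langle\bar\alpha, \bar{a}\rangle \tfrac{d}{N}\big)\big(|\bar\alpha\rangle + e(q \cdot \tfrac{d}{N})|S(A, f(t_{\bar\alpha}))\rangle\big)|\bar\alpha, 1\rangle \Big). \end{aligned} \]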

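Now we measure $\bar\beta$ and $\gamma$. If $\gamma = 0$, the routine failed. Otherwise, the state of $\bar\alpha$ is (omitting the fixed $\bar\beta$ and $\gamma$):

\[ \frac{1}{\sqrt{2}}\big(|\bar\beta\rangle + e(q\tfrac{d}{N})\,|S(A, f(t_{\bar\beta}))\rangle\big). \]

Notice that since $\bar\beta$ is known and $S(A, f(t_{\bar\beta}))$ can be easily found by calling $S$, we can transform this state to the state

\[ \frac{1}{\sqrt{2}}\big(|0\rangle + e(q\tfrac{d}{N})\,|1\rangle\big) \]

by using Claim 4.6. By omitting some qubits, we can assume that this is a state on one qubit. By using the Hadamard transform the state becomes

\[ \frac{1}{2}\Big(\big(1 + e(q\tfrac{d}{N})\big)|0\rangle + \big(1 - e(q\tfrac{d}{N})\big)|1\rangle\Big). \]

We measure the qubit and the probability of measuring 1 is

\[ \frac{1}{4}\big|1 - e(q\tfrac{d}{N})\big|^2 = \frac{1}{4}\big(2 - 2\cos(2\pi q\tfrac{d}{N})\big) = \frac{1}{2} - \frac{1}{2}\cos(2\pi q\tfrac{d}{N}). \]
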
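This completes the description of $R_1$. The routine $R_2$ applies the transform

\[ \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} \]

before the Hadamard transform and thus the state becomes

\[ \frac{1}{2}\Big(\big(1 + e(1/4 + q\tfrac{d}{N})\big)|0\rangle + \big(1 - e(1/4 + q\tfrac{d}{N})\big)|1\rangle\Big) \]

and the probability of measuring 1 becomes $\frac{1}{2} - \frac{1}{2}\cos(\pi/2 + 2\pi q\frac{d}{N}) = \frac{1}{2} + \frac{1}{2}\sin(2\pi q\frac{d}{N})$.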
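The two measurement probabilities can be verified numerically on a single qubit. The following sketch assumes the convention $e(x) = \exp(2\pi i x)$, which is consistent with the identities above; the function names are illustrative, and `phase` stands for $q\,d/N$.

```python
import cmath
import math

def e(x: float) -> complex:
    """Assumed convention: e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def prob_one(phase: float, apply_i_gate: bool = False) -> float:
    """Probability of measuring 1 after a Hadamard transform applied to
    (|0> + e(phase)|1>)/sqrt(2), optionally preceded by diag(1, i)."""
    amp0 = 1 / math.sqrt(2)
    amp1 = e(phase) / math.sqrt(2)
    if apply_i_gate:
        amp1 *= 1j  # the extra transform used by the second routine
    out1 = (amp0 - amp1) / math.sqrt(2)  # Hadamard: |1> amplitude is (a0 - a1)/sqrt(2)
    return abs(out1) ** 2

phase = 0.3
print(prob_one(phase), 0.5 - 0.5 * math.cos(2 * math.pi * phase))
print(prob_one(phase, True), 0.5 + 0.5 * math.sin(2 * math.pi * phase))
```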

From the previous description, it is clear that the probability of measuring 1 conditioned on a non-failure is correct. Thus, it remains to prove that when $f$ is a good matching the failure probability is low. The success probability equals the probability of measuring $\gamma = 1$ which is $|L \cup R|/2^{r-s}$. Assume that none of the $r$ registers is bad. Then, $|L \cup R|/2^{r-s} = |L \cup R|/2^r$ and $L \cup R$ becomes $\{\bar\alpha \in \{0,1\}^r \mid t_{\bar\alpha} \in A_1(f) \cup A_2(f) \wedge S(A, t_{\bar\alpha}) = \bar\alpha \wedge S(A, f(t_{\bar\alpha})) \ne \text{error}\}$. Notice that the size of this set equals $|\{t \mid t \in S(A) \wedge f(t) \in S(A)\}|$ which, according to the definition of a good matching, is at least $\frac{N}{\log^{c_m} N}$. Therefore the probability of success conditioned on all of the registers being good is $|L \cup R|/2^r \ge \frac{1}{2^{c_r}\log^{c_m} N} = \Omega\big(\frac{1}{\log^{c_m} N}\big)$. This concludes the proof since with probability at least $(1 - \frac{1}{\log N})^r = (1 - \frac{1}{\log N})^{\log N + c_r} = \Omega(1)$ none of the registers is bad. ■
Claim 4.8 Given an approximation $x$ of $\sin\phi$ and an approximation $y$ of $\cos\phi$ with additive error $\epsilon$, we can find $\phi \bmod 2\pi$ up to an additive error of $O(\epsilon)$.

Proof: Assume $y \ge 0$ and let $z = \frac{x}{1+y}$. A simple calculation shows that $z$ is an estimate of $\frac{\sin\phi}{1+\cos\phi}$ up to an additive error of at most $4\epsilon$. The estimate on $\phi$ is $2\arctan z$. Since the absolute value of the differential of $\arctan$ is at most 1, this is an estimate of $2\arctan\big(\frac{\sin\phi}{1+\cos\phi}\big) = \phi$ with an additive error of at most $8\epsilon$. When $y < 0$ we compute an estimate of $2\,\mathrm{arccot}\big(\frac{\sin\phi}{1-\cos\phi}\big) = \phi$. ■
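The half-angle recovery in this proof is short enough to sketch in code. The function name is hypothetical, and the branch for $y < 0$ assumes the form $\cot(\phi/2) = \sin\phi/(1-\cos\phi)$ with $\mathrm{arccot}$ taking values in $(0, \pi)$.

```python
import math

def angle_from_estimates(x: float, y: float) -> float:
    """Estimate phi mod 2*pi from x ~ sin(phi) and y ~ cos(phi)."""
    if y >= 0:
        # tan(phi/2) = sin(phi) / (1 + cos(phi)); the denominator is at least 1.
        phi = 2 * math.atan(x / (1 + y))
    else:
        # cot(phi/2) = sin(phi) / (1 - cos(phi)); arccot(z) = pi/2 - atan(z) in (0, pi).
        phi = 2 * (math.pi / 2 - math.atan(x / (1 - y)))
    return phi % (2 * math.pi)

true_phi = 2.5
eps = 1e-3
print(angle_from_estimates(math.sin(true_phi) + eps, math.cos(true_phi) - eps))
```

Splitting on the sign of $y$ keeps the denominator bounded away from zero in both branches, so the input error is amplified only by a constant factor.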

Lemma 4.9 There exists a routine R% such that with probability exponentially close to 1, given 
any q < log ^ N finds a value q' G {q, . . . , log Cm N ■ q} and an estimate x such that x G [q'd — 
A »</d+ wOTf] (mod AT). 



logCm + l jyl L l u '~ l gCm + l J 

Proof: Assume we are given a $q'$-matching $f$. We call routines $R_1$ and $R_2$ $\log^{3c_m+4} N$ times. If the number of successful calls to one of the routines is less than $\log^{2c_m+3} N$, we fail. Otherwise, let $x \in [0,1]$ be the average of the successful calls to $R_1$ and $y \in [0,1]$ be the average of the successful calls to $R_2$. According to the Chernoff bound,

\[ \Pr\left[\Big|x - \big(\tfrac{1}{2} - \tfrac{1}{2}\cos(2\pi q'\tfrac{d}{N})\big)\Big| > \frac{1}{c_e \log^{c_m+1} N}\right] < 2e^{-2\log^{2c_m+3} N / (c_e^2 \log^{2c_m+2} N)} \]

which is exponentially low in $\log N$ for any constant $c_e > 0$. A similar bound holds for $y$. Hence, we can assume that $x' = 1 - 2x$ and $y' = 2y - 1$ are approximations of $\cos(2\pi q'\frac{d}{N})$ and of $\sin(2\pi q'\frac{d}{N})$ respectively up to an additive error of $\frac{2}{c_e \log^{c_m+1} N}$. According to Claim 4.8 this translates to an estimate of $q'\frac{d}{N} \bmod 1$ with an additive error of $\frac{1}{\log^{c_m+1} N}$ for $c_e$ large enough.

By repeating the above procedure with all the matchings that appear in Lemma 4.5 we are guaranteed to find a good matching. According to Lemma 4.7 a call to routine $R_1$ or to routine $R_2$ with a good matching succeeds with probability at least $\frac{c_g}{\log^{c_m} N}$ for a certain $c_g > 0$. The probability that none of $\log^{c_m+1} N$ calls to the subroutine succeeds is $\big(1 - \frac{c_g}{\log^{c_m} N}\big)^{\log^{c_m+1} N}$ which is exponentially small. Thus, for one of the matchings, with probability exponentially close to 1 we have $\log^{2c_m+3} N$ successful calls to routines $R_1$ and $R_2$ and routine $R_3$ is successful. ■

We conclude the proof of Theorem 1.2 with a description of the algorithm for finding $d$. We begin by using routine $R_3$ with the value 1 to obtain an estimate $x_1$ and a value $q \le \log^{c_m} N$ such that $x_1 \in \big[d' - \frac{N}{\log^{c_m+1} N}, d' + \frac{N}{\log^{c_m+1} N}\big] \pmod{N}$ where $d'$ denotes $(dq \bmod N)$. In the following we find $d'$ exactly by calling $R_3$ with multiples of $q$. The algorithm works in stages. In stage $i$ we have an estimate $x_i$ and a value $q_i$. The invariant we maintain is $x_i \in \big[q_i d' - \frac{N}{\log^{c_m+1} N}, q_i d' + \frac{N}{\log^{c_m+1} N}\big] \pmod{q_i N}$. We begin with $x_1$ as above and $q_1 = 1$. Assume that the invariant holds in stage $i$. We use routine $R_3$ with the value $2q_i q$ to obtain an estimate $x$ with a value $q' \in \{2q_i q, 4q_i q, \ldots, 2\log^{c_m} N \cdot q_i q\}$ such that $x \in \big[q_{i+1} d' - \frac{N}{\log^{c_m+1} N}, q_{i+1} d' + \frac{N}{\log^{c_m+1} N}\big] \pmod{N}$ where $q_{i+1} = q'/q$. Notice that our previous estimate $x_i$ satisfies $\frac{q_{i+1}}{q_i} x_i \in \big[q_{i+1} d' - \frac{2N}{\log N}, q_{i+1} d' + \frac{2N}{\log N}\big] \pmod{q_{i+1} N}$. Since this range is much smaller than $N$, we can combine the estimate $x$ on $(q_{i+1} d' \bmod N)$ and the estimate $\frac{q_{i+1}}{q_i} x_i$ on $(q_{i+1} d' \bmod q_{i+1} N)$ to obtain $x_{i+1}$ such that $x_{i+1} \in \big[q_{i+1} d' - \frac{N}{\log^{c_m+1} N}, q_{i+1} d' + \frac{N}{\log^{c_m+1} N}\big] \pmod{q_{i+1} N}$. The last stage is when $q_i > \frac{2N}{\log^{c_m+1} N}$. Then, $d'$ can be found by rounding $\frac{x_i}{q_i}$ to the nearest integer. Given $d'$ there are at most $q \le \log^{c_m} N$ possible values for $d$. Since this is only a polynomial number of options we can output one randomly.
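The stage-by-stage refinement can be simulated classically under strong simplifying assumptions. In the sketch below, a hypothetical noisy oracle `estimate(q)` stands in for routine $R_3$: it returns a real number within `E` of $(q \cdot d \bmod N)$, modulo $N$, and always answers for exactly the requested value (that is, $q = 1$ and $q_{i+1} = 2q_i$ at every stage). The combination step requires $3E < N/2$. All names are illustrative.

```python
import math

def combine(x_prev_scaled: float, x_new: float, N: int, modulus: int) -> float:
    """Value congruent to x_new mod N that is closest to x_prev_scaled,
    reduced mod `modulus` (a multiple of N)."""
    k = round((x_prev_scaled - x_new) / N)
    return (x_new + k * N) % modulus

def recover_d(N: int, estimate, E: float) -> int:
    """Recover d from estimates of (q*d mod N) that are accurate to within E."""
    q, x = 1, estimate(1)
    while q <= 2 * E:
        q_next = 2 * q
        # 2*x is within 2E of q_next*d modulo q_next*N; the fresh estimate is
        # within E modulo N. Together they pin the value down modulo q_next*N.
        x = combine(2 * x, estimate(q_next), N, q_next * N)
        q = q_next
    # Now the error E/q is below 1/2, so rounding x/q yields d (mod N).
    return round(x / q) % N

def make_oracle(d: int, N: int, E: float):
    """Deterministic toy oracle with error E*sin(q), for demonstration only."""
    return lambda q: ((q * d) % N + E * math.sin(q)) % N

print(recover_d(1009, make_oracle(5, 1009, 20.0), 20.0))
```

The doubling mirrors how the estimate modulo $q_i N$ gains precision at each stage until rounding becomes exact.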